Randomized Exploration for Reinforcement Learning with Multinomial Logistic Function Approximation
We study reinforcement learning with _multinomial logistic_ (MNL) function approximation, where the underlying transition probability kernel of the _Markov decision process_ (MDP) is parametrized by an unknown transition core with features of state and action. For the finite-horizon episodic setting with inhomogeneous state transitions, we propose provably efficient algorithms with randomized exploration having frequentist regret guarantees. Here, $d$ is the dimension of the transition core, $H$ is the horizon length, $T$ is the total number of steps, and $\kappa$ is a problem-dependent constant. Despite the simplicity and practicality of \texttt{RRL-MNL}, its regret bound scales with $\kappa^{-1}$, which is potentially large in the worst case. To improve the dependence on $\kappa^{-1}$, we propose \texttt{ORRL-MNL}, which estimates the value function using local gradient information of the MNL transition model.
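To make the setting concrete, the following is a minimal sketch (not the paper's algorithm) of an MNL transition model and a Gaussian-perturbation step of the kind used by randomized-exploration methods. All function names are illustrative; `phi` stands for the state-action-next-state features and `theta` for the transition core from the abstract.

```python
import numpy as np

def mnl_transition_probs(theta, phi):
    """MNL transition probabilities over K candidate next states.

    theta : (d,) transition-core parameter (unknown to the learner)
    phi   : (K, d) feature vectors phi(s, a, s') for the candidate next states

    p(s' | s, a) = exp(phi(s,a,s') @ theta) / sum_j exp(phi(s,a,s_j) @ theta)
    """
    logits = phi @ theta
    logits -= logits.max()          # subtract max for numerical stability
    w = np.exp(logits)
    return w / w.sum()

def perturbed_core(theta_hat, cov, scale, rng):
    """One randomized-exploration step: sample a perturbed transition core
    around the current estimate, theta ~ N(theta_hat, scale^2 * cov).
    Planning greedily with the perturbed core induces exploration."""
    d = theta_hat.shape[0]
    noise = rng.multivariate_normal(np.zeros(d), cov)
    return theta_hat + scale * noise
```

The probabilities are a softmax of linear scores, which is what makes the model "multinomial logistic"; the perturbation scale and covariance would be set by the algorithm's theory, which is elided here.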
Improved Regret of Linear Ensemble Sampling
In this work, we close the fundamental gap of theory and practice by providing an improved regret bound for linear ensemble sampling. We prove that with an ensemble size logarithmic in $T$, linear ensemble sampling can achieve a frequentist regret bound of $\tilde{\mathcal{O}}(d^{3/2}\sqrt{T})$, matching state-of-the-art results for randomized linear bandit algorithms, where $d$ and $T$ are the dimension of the parameter and the time horizon respectively. Our approach introduces a general regret analysis framework for linear bandit algorithms. Additionally, we reveal a significant relationship between linear ensemble sampling and Linear Perturbed-History Exploration (LinPHE), showing that LinPHE is a special case of linear ensemble sampling when the ensemble size equals $T$. This insight allows us to derive a new regret bound of $\tilde{\mathcal{O}}(d^{3/2}\sqrt{T})$ for LinPHE, independent of the number of arms. Our contributions advance the theoretical foundation of ensemble sampling, bringing its regret bounds in line with the best known bounds for other randomized exploration algorithms.
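A schematic implementation of linear ensemble sampling may help fix ideas. This is a generic sketch under common conventions (shared Gram matrix, independently perturbed reward targets per ensemble member), not the paper's exact construction; the class name and perturbation scheme are assumptions for illustration.

```python
import numpy as np

class LinearEnsembleSampling:
    """Minimal linear ensemble sampling sketch.

    Maintains m perturbed regularized least-squares estimates; each round
    one member is drawn uniformly at random and the arm maximizing its
    estimated reward is played.
    """
    def __init__(self, d, m, lam=1.0, noise_sd=1.0, seed=0):
        self.rng = np.random.default_rng(seed)
        self.V = lam * np.eye(d)          # shared regularized Gram matrix
        self.b = np.zeros((m, d))         # one perturbed target vector per member
        self.m = m
        self.noise_sd = noise_sd

    def select(self, arms):
        """arms: (K, d) feature matrix; returns the index of the chosen arm."""
        j = self.rng.integers(self.m)               # random ensemble member
        theta_j = np.linalg.solve(self.V, self.b[j])
        return int(np.argmax(arms @ theta_j))

    def update(self, x, r):
        """Rank-one Gram update; each member sees an independently
        perturbed copy of the observed reward."""
        self.V += np.outer(x, x)
        for j in range(self.m):
            self.b[j] += (r + self.noise_sd * self.rng.standard_normal()) * x
```

The connection to LinPHE stated in the abstract can be read off this template: with ensemble size $m = T$, each member's target is the full history with fresh perturbations, which is exactly a perturbed-history estimate.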
Geometry-Aware Approaches for Balancing Performance and Theoretical Guarantees in Linear Bandits
This paper is motivated by recent research in the $d$-dimensional stochastic linear bandit literature, which has revealed an unsettling discrepancy: algorithms like Thompson sampling and Greedy demonstrate promising empirical performance, yet this contrasts with their pessimistic theoretical regret bounds. The challenge arises from the fact that while these algorithms may perform poorly in certain problem instances, they generally excel in typical instances. To address this, we propose a new data-driven technique that tracks the geometric properties of the uncertainty ellipsoid around the main problem parameter. This methodology enables us to formulate an instance-dependent frequentist regret bound, which incorporates the geometric information, for a broad class of base algorithms, including Greedy, OFUL, and Thompson sampling. This result allows us to identify and ``course-correct" problem instances in which the base algorithms perform poorly. The course-corrected algorithms achieve the minimax optimal regret of order $\tilde{\mathcal{O}}(d\sqrt{T})$ for a $T$-period decision-making scenario, effectively maintaining the desirable attributes of the base algorithms, including their empirical efficacy. We present simulation results to validate our findings using synthetic and real data.
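The geometric quantity being tracked can be sketched in a few lines. Assuming a standard confidence ellipsoid of the form $\{\theta : (\theta - \hat\theta)^\top V (\theta - \hat\theta) \le \beta^2\}$ around the estimate, its shape is determined by the eigenvalues of the Gram matrix $V$; the diagnostic and threshold rule below are illustrative, not the paper's exact course-correction criterion.

```python
import numpy as np

def ellipsoid_axes(V, beta):
    """Semi-axis lengths of the confidence ellipsoid
    {theta : (theta - theta_hat)^T V (theta - theta_hat) <= beta^2}.

    Along the eigenvector of V with eigenvalue lambda_i, the semi-axis
    has length beta / sqrt(lambda_i): a small eigenvalue of V marks a
    direction of parameter space that remains poorly explored.
    """
    eigvals = np.linalg.eigvalsh(V)
    return beta / np.sqrt(eigvals)

def needs_course_correction(V, beta, threshold):
    """Flag an instance whose longest ellipsoid axis exceeds `threshold`,
    suggesting a forced-exploration step for a greedy-style base algorithm."""
    return bool(ellipsoid_axes(V, beta).max() > threshold)
```

The point of such a check is instance-dependence: on typical instances greedy-style play keeps all eigenvalues growing and no correction fires, while on hard instances the elongated ellipsoid is detected and exploration is injected.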